ABSTRACT

When two groups, initially dissimilar, undergo
different treatments, can subsequent differences be partitioned in
such a way that the difference between the two treatments is
unbiased? This is the central problem of this paper, and it is
confronted by the examination of two levels of information using a
Follow Through Evaluation. The first information level contains, in
addition to outcome variables (achievement tests), information on
child characteristics and family background. The second contains all
the variables of the first plus three achievement tests given at an
earlier time. Although many educators believe it is important to use
a pretest to adjust posttest scores, closer inspection reveals that
this process typically explains only a fraction of the variation on
posttest scores. Even if pretests substantially explain variations on
concurrent variables, this does not warrant the conclusion that
treatment differences based on posttests will be altered by
additional information. It is concluded that a multifaceted approach
may reveal the analysis or analyses best suited for a particular
question. (Author/BJG)

The Problem

When two groups, initially dissimilar, undergo different
treatments, can subsequent differences be partitioned in such
a way that the difference between the two treatments is
unbiased? That is, can the treatment difference be estimated
free of all other antecedent and concurrent influences? The
answer depends on how much we know about the initial
dissimilarity. If all the variables that produced initial differences
are known and well measured, and the structure of their
relationship with the outcome measure is also known for both groups,
the initial dissimilarity can be totally removed. In educational
research this condition is rarely, if ever, met. When
these variables are unknown or unobserved, not only the program
but everything else that might have produced differences in
outcomes between the groups is a possible cause.

This paper deals with two levels of information. The
first contains, in addition to the outcome variables (the
achievement tests), information on child characteristics and
family background. The second contains all the variables of
the first plus three achievement tests given at an earlier time.

The first level is an example of cross sectional data: a
number of measurements are taken on a sample of people once.
Although the background variables in this case were measured
prior to the post-test, we treat them as if they were gathered
simultaneously since they could have been measured at the same
time. The second level represents longitudinal data: a number
of measurements are taken on a sample of people at time 1 and
another set of measurements (on possibly different variables)
is taken on the same sample at time 2. Does utilization of the
longitudinal information (in this case the three initial
achievement tests) change our conclusions about differences
among groups of people on the outcome achievement tests?

Many educational researchers believe it is important to
use pre-tests to adjust post-test scores. The measures
available at the time of post-testing (often socio-economic
variables on the child's family along with her or his personal
attributes) typically explain only a minor fraction of the
variation in post-test scores. Addition of pre-tests to the
concurrent variables usually substantially increases the
fraction of variation that can be explained. However, it does not
follow from this that conclusions about treatment differences
based upon the post-tests will be altered by the additional
information. For this to happen the structure of the
relationship between the expanded set of explanatory variables and the
outcome variables must be changed by the introduction of the
pre-test variables.
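The distinction drawn above, between explaining more variation and changing the treatment estimate, can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the Follow Through data: the function names `ols` and `compare`, the coefficients 1.0, 0.5 and 2.0, and the `pretest_gap` parameter are all assumptions made for illustration. When the two groups have the same mean pre-test, adding the pre-test raises R-squared sharply but leaves the treatment coefficient roughly where it was; when the groups differ on the pre-test, the coefficient moves as well.

```python
import numpy as np

def ols(y, *columns):
    """Least-squares fit with an intercept; returns (coefficients, R-squared)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta
    return beta, 1.0 - residual.var() / y.var()

def compare(pretest_gap, n=4000, seed=0):
    """Treatment coefficient and R-squared without and with the pre-test.

    pretest_gap is the hypothetical mean pre-test difference between groups.
    """
    rng = np.random.default_rng(seed)
    treat = np.repeat([0.0, 1.0], n // 2)           # comparison vs. treated
    background = rng.normal(size=n)                 # concurrent covariate
    pretest = rng.normal(size=n) + pretest_gap * treat
    post = 1.0 * treat + 0.5 * background + 2.0 * pretest + rng.normal(size=n)
    b_short, r2_short = ols(post, treat, background)
    b_long, r2_long = ols(post, treat, background, pretest)
    return {"effect_without": b_short[1], "effect_with": b_long[1],
            "r2_without": r2_short, "r2_with": r2_long}

balanced = compare(pretest_gap=0.0)     # groups alike on the pre-test
unbalanced = compare(pretest_gap=1.0)   # groups differ on the pre-test
for label, result in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(label, {k: round(float(v), 3) for k, v in result.items()})
```

In both runs the explained variation rises substantially once the pre-test enters; only in the second run does the estimated treatment difference shift by a large amount, because only there is the pre-test related to group membership.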



The Problem in the Follow Through Evaluation

The Follow Through evaluation, like other evaluations of
large-scale educational interventions, is not a true experiment
and thus does not involve random assignment at any stage. Since
some child and site characteristics were measured we know that
these characteristics are confounded with the models (treatments).
We know the extent of confounding with the measured variables
but without randomization we do not know the extent of
confounding with unmeasured variables. Additionally, treatments
and non-treatments were, for the most part, poorly defined.
Many communities, sponsors, schools, etc. had unique features
that may well have interacted with treatment or non-treatment.
Also, many of the events that occur in the four years between
entrance into and exit from the program are undiscoverable.
Finally, almost 60% of the children in the evaluation disappear
before they exit. The characteristics of the reduced sample may
differ from the initial sample in ways related to outcomes.

In this situation we must use auxiliary information about
students as well as family, teacher and community information to
adjust estimates of outcomes. There are a number of ways of
tapping this auxiliary information to produce less biased and/or
more reliable estimates of outcomes. At this time, however, we
want to concentrate on a narrow problem: the degree to which
the addition of pretest information alters the configuration of
post-test estimates.



The Study 

The Sample. In this initial exploration we have utilized
only a small segment of the data set collected by SRI: a
subsample of the Summer Study data. Our selection of this
segment was based on convenience since a work tape had already
been prepared for the Summer Study. The data have been reduced
to a manageable subset of variables and reorganized so that they
can be read by DATATEXT, a social science package of computer
programs. This subsample consists of approximately 400 FT and
NFT children in Philadelphia who entered kindergarten in fall
1971. The tape includes test data from fall 1971 and spring
1972 and the parent interview from fall 1971. The children
fall into one of three FT models and their comparison groups:
Bank Street College of Education (0508), the University of
Kansas (0803) and Educational Development Corporation (1103).
While these three sponsors are only a subset of the FT models,
they reflect the extremes on a continuum of classroom structure.
Consequently, we felt that this was a sufficient sample for our
purposes.

Techniques of Analysis. We wished to determine if the
addition of pre-test scores would result in different inferences
about effects of the three models. In this study we analyzed
each of three MAT subtests given in spring 1972 twice: once with
background information included as covariates, and once with both
background information and scores on each of the three WRAT
subtests given in fall 1971 included as covariates.

Two separate analyses were undertaken. The first, designed
to assess mean effects and interactions, was an unweighted
means ANCOVA without inspection of the within cells regression
coefficients. This means that we assumed that each of the six
cells of the 2 by 3 design had the same relationship with a
set of covariates without examining that assumption. The
factors were program (Follow Through or non-Follow Through)
and model (0508, 0803 or 1103). The ANCOVA adjusts each
obtained mean by using the regression coefficients, the within
cell means of the covariates, and the grand means of the
covariates. In the ANCOVA we estimated the two main effects and
their interaction under the assumption of a fixed model.
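The adjustment described above can be written out in its standard textbook form. The notation is an assumption introduced here and does not appear in the report: i indexes program, j indexes model, k runs over the K covariates, and b_k is the common within-cell regression coefficient for covariate k.

```latex
\bar{Y}^{\,\mathrm{adj}}_{ij}
  \;=\; \bar{Y}_{ij}
  \;-\; \sum_{k=1}^{K} b_k \left( \bar{X}_{k,ij} - \bar{X}_{k,\cdot\cdot} \right)
```

Each obtained cell mean is moved by the amount its covariate means depart from the grand covariate means, weighted by the regression coefficients. Using one b_k for all six cells is exactly the unexamined assumption of equal within-cell regressions noted in the text.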

The second analysis used the general linear model and
dummy coded variables to estimate effects for five different
FT/NFT by site combinations. The one excluded combination
was NFT in site 1103 since the inclusion of all six groups
in the model would have made the data matrix linearly dependent.
This second analysis is equivalent to an unbalanced one-way
ANCOVA under the fixed model assumption with an exact least
squares solution. The levels of the classification factor
are the FT/NFT by site combinations.
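The linear dependence mentioned above can be checked directly. The following is a minimal sketch on toy data, assuming an intercept column plus one 0/1 dummy per group; the group labels, the helper name `design_matrix`, and the three-children-per-group layout are hypothetical and chosen only for illustration.

```python
import numpy as np

def design_matrix(groups, levels, drop=None):
    """Intercept plus one 0/1 dummy column per level, optionally omitting `drop`."""
    kept = [level for level in levels if level != drop]
    dummies = [(groups == level).astype(float) for level in kept]
    return np.column_stack([np.ones(len(groups)), *dummies])

levels = ["FT0508", "NFT0508", "FT0803", "NFT0803", "FT1103", "NFT1103"]
groups = np.array(levels * 3)   # toy data: three children per group

full = design_matrix(groups, levels)                     # all six dummies
reduced = design_matrix(groups, levels, drop="NFT1103")  # five dummies

rank_full = np.linalg.matrix_rank(full)
rank_reduced = np.linalg.matrix_rank(reduced)
print("columns / rank with six dummies: ", full.shape[1], rank_full)
print("columns / rank with five dummies:", reduced.shape[1], rank_reduced)
```

With all six dummies the columns sum to the intercept column, so the matrix has more columns than its rank and the least squares solution is not unique. Dropping one group restores full column rank, and each remaining coefficient is then read as a difference from the omitted group.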

" "Before presenting results we should discuss ^the variables 
used in the study. The three dependent variables were the
Listening for Sounds, Reading, and Numbers subtests of the MAT
Primer test battery. Subtest reliabilities are all between
.89 and .96 whether measured by split half reliability corrected
by the Spearman-Brown formula or a modification of
Kuder-Richardson formula 20.* (Low reliability would lead to an enlargement
of error variance but would not bias estimates.)
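For reference, the Spearman-Brown correction for a split half reliability has the standard form below. The symbols are notation introduced here, not the report's: r_hh is the correlation between the two half-tests and r_SB is the estimated reliability of the full-length test.

```latex
r_{SB} \;=\; \frac{2\, r_{hh}}{1 + r_{hh}}
```

As a worked case, a half-test correlation of .80 gives 1.60 / 1.80, or about .89, which is the lower end of the range reported above.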

The three fall WRAT tests used as covariates were in fact
proper subsets of the full-length WRAT subtests. Items were
deleted from the whole subtests by SRI because they were deemed
too difficult or in some other way unsatisfactory for the
children tested. The three SRI WRAT subtests were Spelling,
Reading and Math. SRI computed Cronbach's coefficient alpha
as a reliability measure for each of the three subtests and in all
cases it was around .95. This means that we do not have to
worry about the bias that might result from not adjusting each
of the SRI WRAT subtests by its reliability coefficient, since
the reliability is so high that the adjustment would add little
accuracy to the analyses.

The background covariates were as follows:

    Age: the child's age in months on September 1, 1972.
    Sex: coded as 0 for boys and 1 for girls to allow
        for an effect estimate.
    Preschool experience: the number of months of
        preschool experience with a maximum of 36 months.
    Household size: the total number of persons living in
        the child's household.
    Mother's education: the approximate number of years of
        schooling completed by the child's mother.
    Household income: the approximate amount of income
        available to the household annually.

* Metropolitan Achievement Tests Special Report, 1970 Edition
(New York: Harcourt, Brace and Jovanovich, 1971).

Both mother's education and household income are grouped variables.
Since the intervals were chosen to make their variances
equal, their use in ordinary least squares regressions will
result in a violation of the assumption of homoscedasticity
about the regression surface and often a substantial and
artificial increase in R-square.* This violation will, in our
opinion, not result in enough bias to be worth worrying about,
especially since adjusting for the problem would substantially
increase the standard error of estimate on these variables.

The last variables are the design variables. Their
crossbreak determines the cell sample sizes, which are set out
below.

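Table 1

FT/NFT by Model Numbers
Cell Sample Sizes

                         Model Number
               0508           0803           1103           Total

  FT        n11 = [71]     n12 = [85]     n13 = 82       n1+ = 238
  NFT       n21 = 32       n22 = [43]     n23 = [38]     n2+ = 113

  Total     n+1 = 103      n+2 = 128      n+3 = [120]    n   = 351

[ ]: entry illegible in the source; the value shown is the one
implied by the legible cell entries and marginal totals.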

The total sample of 351 is that number of children who had valid
scores on all six of the subtests, the three WRATs and the
three MATs. The original eligible pool was composed of 481
children.

* J. Johnston, Econometric Methods, 2nd Ed. (New York: McGraw-
Hill, 1971), pp. 228-238.

Table 2 presents the effects of the ANCOVA. Row and
column figures represent the main effects and cell figures
represent the interaction effects. For each row, column and
cell there are two entries: the upper one gives the adjusted
effect after the background covariates have been included in
the equation and the bottom entry gives the adjusted effect
after both background and WRAT pre-test covariates have been
included. The row effects reflect the overall differences
between Follow Through and non-Follow Through for these models.
The column effects are the model effects, a misnomer since
only the FT groups have the model. They reflect differences
among the FT and paired comparisons for the three models, and
do not involve a comparison between FT and NFT. The
interactions in the FT row are the effects of each of the three
models, which can be compared to each other and to the
corresponding NFT cells. These are the effects we are most
interested in since they form the basis for inferences about
differential model effects.

Thus, instead of trying to interpret the entire table we
have summarized the relevant data. Table 3 presents the
adjusted means for the six cells under both analyses. The first
two rows under each subtest show the results without the WRAT
pretest as a covariate. The second two rows reflect the
results with the WRAT used as a covariate.

TABLE 2

MAIN EFFECTS AND INTERACTIONS
FOR TWO ANCOVAs ON EACH OF
THREE MAT SUBTESTS*

                            Model Number               Program
                      0508      0803      1103         Effects

MAT SOUNDS
  FT                  1.283      .270    -1.553         -.356
                       .881      .185    -1.067         -.435
  NFT                -1.283     -.270     1.553          .356
                      -.881     -.185     1.067          .435
  Model Effects        .444      .741    -1.185
                       .108     1.011    -1.119

MAT READING
  FT                   .720      .278     -.998         -.359
                       .412      .171     -.582         -.498
  NFT                 -.720     -.278      .998          .359
                      -.412     -.171      .582          .498
  Model Effects       -.068     1.084    -1.016
                      -.377     1.290     -.914

MAT NUMBERS
  FT                  -.493     1.176     -.683          .508
                      -.787      .978     -.191          .315
  NFT                  .493    -1.176      .683         -.508
                       .787     -.978      .191         -.315
  Model Effects       -.063     1.742    -1.679
                      -.445     2.013    -1.568

Grand Means (Unadjusted): Sounds 15.698; Reading 14.306; Numbers 11.466

* Upper entries represent background covariates alone; lower
entries represent background covariates and WRAT covariates.

TABLE 3

ADJUSTED CELL MEANS

                                  Model
                          05        08        11

Sounds
  no WRAT      FT       18.069    17.354    13.604
               NFT      16.214    17.524    17.422
  with WRAT    FT       17.252    17.459    14.077
               NFT      16.359    17.959    17.081

Reading
  no WRAT      FT       14.599    15.308    11.932
               NFT      13.877    15.471    14.647
  with WRAT    FT       13.842    15.269    12.312
               NFT      14.016    15.924    14.473

Numbers
  no WRAT      FT       11.418    14.892     9.612
               NFT      11.387    11.525     9.952
  with WRAT    FT       10.549    14.773    10.022
               NFT      11.493    12.186     9.773

Since we are interested in the effects of the FT models, we
then took the difference between the FT and NFT adjusted means for
each model on each subtest. Table 4 presents the FT/NFT
differences for the two analyses.

TABLE 4

DIFFERENCES IN ADJUSTED MEANS
(FT - NFT) WITHIN EACH MODEL

                          Model
  Analysis        05        08        11

Sounds
  (a)           1.855     -.170    -3.818
  (b)            .893     -.500   [-3.004]

Reading
  (a)            .722     -.163    -2.715
  (b)           -.174     -.655    -2.161

Numbers
  (a)            .031     3.367     -.350
  (b)           -.944     2.587      .249

(a): analysis without the WRAT pretest
(b): analysis with the WRAT pretest as a covariate
[ ]: entry illegible in the source; the value shown is the
difference of the corresponding Table 3 adjusted means
(14.077 - 17.081).

This table now allows a comparison among the FT/NFT
differences across models for the two analyses. For both Sounds
and Reading, inferences about the relative standing of the
models are the same under both analyses. The WRAT covariate
narrows the separation but doesn't change the order. Thus,
inferences about the size of the differences are affected. For
the Numbers subtest, inferences about the relative standing of
the models are affected, though not dramatically. Model 08 is
the highest in both cases, but 05 and 11 change order when the
WRAT is introduced. Inferences on the size of the differences
are also affected.

Finally, in all three subtables of Table 4, the effect
estimates change by enough to make the inclusion of the three
WRAT pretests seem worth the expense and effort.

In Table 5, we present the ANOVA parts of the ANCOVA.
There are again double entries within cells to represent the
partial and full sets of covariates.

The thing to notice about Table 5 is that the addition
of the three WRAT tests to the set of covariates increases
precision on all three subtests, that is, the mean square
residual always decreases. Since the degrees of freedom for
main effects and interaction tests are the same no matter how
many covariates we have, other things equal, an increase in
precision should increase F-ratios for effects and interactions
and lower the associated significance levels. And in eight of
twelve instances F ratios rise and significance levels fall.
However, in three instances (the Sounds and Reading interactions
and the Numbers program effect) F ratios fall and significance
levels rise; the Numbers interaction F ratio also falls, though
only slightly (4.665 to 4.574). This suggests that the amount of
bias present without the introduction of these three additional
covariates is so great that it more than offsets the gain in
precision.
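The arithmetic behind this argument is simply F = (effect mean square) / (residual mean square). The sketch below recomputes four F ratios from the mean squares printed for MAT Sounds in Table 5; the helper name `f_ratio` and the dictionary layout are illustrative assumptions, while the numbers are the table's own.

```python
def f_ratio(ms_effect, ms_residual):
    """F statistic for an effect: its mean square over the residual mean square."""
    return ms_effect / ms_residual

# Residual mean squares for MAT Sounds under the two covariate sets (Table 5).
ms_residual = {"background": 31.071, "background+WRAT": 26.623}

# Effect mean squares for MAT Sounds (Table 5).
ms_effect = {
    "Program":         {"background": 33.910,  "background+WRAT": 48.980},
    "Program by Site": {"background": 205.408, "background+WRAT": 93.939},
}

f_values = {
    source: {cov: f_ratio(ms, ms_residual[cov]) for cov, ms in by_cov.items()}
    for source, by_cov in ms_effect.items()
}
for source, by_cov in f_values.items():
    for cov, f in by_cov.items():
        print(f"{source:16s} {cov:16s} F = {f:.3f}")
```

The residual mean square drops in both comparisons, so the F ratio can only fall if the effect mean square itself shrinks by more, which is what happens for the Program by Site interaction. The recomputed values agree with the printed F ratios to within rounding.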
TABLE 5

ANCOVA ANALYSIS OF VARIANCE
TABLE FOR THREE MAT SUBTESTS

                                   Sum of        Mean
  Source                 D.F.      Squares       Square     F-Test   Significance

MAT SOUNDS
  Program                   1       33.910       33.910      1.091   .297
                            1       48.980       48.980      1.840   .176
  Site                      2      202.879      101.439      3.265   .040
                            2      213.965      106.982      4.018   .019
  Program by Site           2      410.816      205.408      6.611   .002
                            2      187.879       93.939      3.529   .031
  Covariates                6      502.016       83.669      2.693   .015
                            9     2089.867      232.207      8.722   under .001
  Residual (Error)        339    10533.180       31.071
                          336     8945.328       26.623
  Total (After Mean)      350    11655.504       33.301

MAT READING
  Program                   1       34.563       34.563      2.320   .129
                            1       64.101       64.101      5.562   .019
  Site                      2      210.426      105.213      7.062   [illegible]
                            2      250.263      125.132     10.857   under .001
  Program by Site           2      158.922       79.461      5.354   .006
                            2       52.150       26.075      2.262   .10?
  Covariates                6      510.016       85.003      5.705   under .001
                            9     1687.916      187.546     16.272   under .001
  Residual (Error)        339     5050.547       14.898
                          336     3872.647       11.526
  Total (After Mean)      350     5962.102       17.035

MAT NUMBERS
  Program                   1       69.148       69.148      3.119   .079
                            1       25.684       25.684      1.527   .218
  Site                      2      557.184      278.592     12.567   under .001
                            2      634.160      317.080     18.846   under .001
  Program by Site           2     [206.840]     103.420      4.665   [illegible]
                            2      153.922       76.961      4.574   .011
  Covariates                6      795.172      132.529      5.978   under .001
                            9     2657.109      295.234     17.548   under .001
  Residual (Error)        339     7515.023       22.168
                          336     5653.086       16.825
  Total (After Mean)      350     9118.956       26.054

[ ]: entry illegible in the source. The bracketed sum of squares
is the printed mean square times its degrees of freedom
(2 x 103.420). A "?" marks a single illegible digit.
When one also examines the large shifts that occur in some of
the F ratios that rise, it is clear that the addition of the
pretests yields a less distorted picture of what is going on.
For this set of analyses, we conclude that the longitudinal
study is superior to the cross sectional study.

In Table 6 we review the analyses using the general linear
model. The top row under each subtest presents the results for
the analysis without the WRAT pretests. The second row has the
results with the pretests as covariates. The column headings
are now the FT/NFT by model combinations and each of the five
effects is tested on a single degree of freedom. The
combinations are coded as a series of five dummy variables. The upper
entry in each cell is the effect estimate. This estimate equals
zero for the omitted sixth group. The bottom entry in each cell
is the significance level of t-tests made on 339 degrees of
freedom for the first analysis (without the WRAT) and 336 degrees of
freedom for the second analysis.

In order to compare model effects we have summarized
Table 6 in Table 7. In this table, the row headings are the
same. Each cell entry now is the difference between the FT
effect and the NFT effect for each model under each analysis.

The FT/NFT differences in this table are almost identical
to those in Table 4. Consequently, the conclusions drawn from
Table 4 are the same for these analyses.
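The step from Table 6 to Table 7 is a subtraction within each model, with the omitted NFT 1103 group entering as zero. A minimal sketch, using the MAT Reading effects from the background-covariates row of Table 6; the function name `ft_minus_nft` and the dictionary keys are hypothetical labels chosen for illustration.

```python
def ft_minus_nft(effects, models=("0508", "0803", "1103")):
    """FT effect minus NFT effect for each model, rounded to four decimals."""
    return {m: round(effects["FT" + m] - effects["NFT" + m], 4) for m in models}

# MAT Reading, background covariates only (Table 6).
# NFT 1103 is the omitted reference group, so its effect is zero by construction.
reading_background = {
    "FT0508": -0.0650, "NFT0508": -0.7516,
    "FT0803":  0.6444, "NFT0803":  0.8126,
    "FT1103": -2.7367, "NFT1103":  0.0,
}

reading_differences = ft_minus_nft(reading_background)
print(reading_differences)
```

Because every coefficient is measured from the same reference group, the reference cancels in each FT minus NFT difference, which is why these differences can be set beside the adjusted-mean differences of Table 4.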

TABLE 6

GENERAL LINEAR MODEL EFFECT SIZES
AND THEIR SIGNIFICANCE LEVELS FOR
TWO ANALYSES AND THREE MAT SUBTESTS

                    FT 0508     NFT 0508     FT 0803     NFT 0803     FT 1103

MAT SOUNDS
  Background         .6477     [-1.1943]     -.0747       .0720      -3.8204
  Covariates       over .500     .383      over .500    over .500     .001

  Background         .1908      -.7046        .3933       .8682      -3.0072
  & WRATS          over .500   over .500   over .500      .461        .005
  Covariates

MAT READING
  Background        -.0650      -.7516        .6444       .8126      -2.7367
  Covariates       over .500     .427         .414        .353      under .001

  Background        -.6250      -.4344        .8006      1.4580      -2.1829
  & WRATS            .385      over .500      .259        .061        .002
  Covariates

MAT NUMBERS
  Background        1.4303      1.4241       4.9034      1.5201       -.3740
  Covariates         .146        .218      under .001     .154      over .500

  Background         .7742      1.7223       4.9992      2.3951        .2267
  & WRATS            .373        .089      under .001     .011      over .500
  Covariates

[ ]: entry illegible in the source; the value shown is the one
implied by the FT 0508 effect and the corresponding Table 7
difference (.6477 - 1.8420).



TABLE 7

FT EFFECT - NFT EFFECT FOR
THE GENERAL LINEAR MODEL FOR
TWO ANALYSES AND THREE MAT SUBTESTS

                          Model
  Analysis        05         08         11

SOUNDS
  (a)           1.8420     -.1467    -3.8204
  (b)            .8954     -.4749    -3.0072

READING
  (a)            .6866     -.1682    -2.7367
  (b)           -.1906     -.6574    -2.1829

NUMBERS
  (a)            .0062     3.3833     -.3740
  (b)           -.9481     2.6041      .2267

(a): analysis without WRAT pretest
(b): analysis with WRAT pretest as a covariate



Recommendations

The study reported here is just a start. Much more
work using different strategies and samples needs to be done
before researchers begin to get a feel for the magnitude of
the data collection effort they must make for conclusions to
be given credence.

Just confining ourselves to the data from the Follow
Through Evaluation, there are many ways to sample individuals,
other units of analysis or variables from the data tape and
many ways to subject the different samples of units and
variables to quantitative analyses. At this point a
multifaceted approach may reveal the analysis or analyses best
suited for a particular question.



